17 July 2016

Why Study the NYC Taxi Data?

  • Together, the ~24 GB of YELLOW taxi data (140M rides) and the ~3 GB of GREEN taxi data (19M rides) from 2015 prepare us to deal with big data sets.

  • The longitude and latitude recorded by each taxi's GPS at the start and at the destination of each ride capture the geospatial distribution of the taxi rides.

  • The time stamps of the taxi rides capture New Yorkers' highly dynamic commuting activity across the years, the months, the weeks and the hours.

  • The tips and fares paid for the rides capture how passengers behave in response to NYC's traffic conditions.

To Tip or Not to Tip, That's the Question

The Tipping Habits of New York Taxi Passengers

The goal of this talk is to explore the tipping behavior of New Yorkers using both the Yellow and the Green taxi ride data from 2015. Tips (or the lack of them) are how passengers express their satisfaction or dissatisfaction. We would like to explore the factors that influence passengers' tipping habits.

Our main characters: the YELLOW cabs and their little cousins, the GREEN cabs.

Why Is There a Need for the GREEN Cabs (Borough Taxis)?

The GREEN taxi map is on the left.

Taxi Rides: Weekday vs. Hour 2D Heat Map

The GREEN taxi rides are on the left.

Taxi Speed: Weekday vs. Hour 2D Heat Map

The GREEN taxi rides are on the left. The hour marks increase monotonically along the y-axis.
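As a rough illustration of how a weekday-by-hour speed grid could be built, here is a minimal sketch in plain Python. It assumes each ride is a (pickup time, drop-off time, distance in miles) tuple and that speed is distance divided by duration. The function name `mean_speed_grid` and the sample rides are hypothetical, not taken from the talk.

```python
from collections import defaultdict
from datetime import datetime


def mean_speed_grid(rides):
    """Return {(weekday, hour): mean speed in mph}, with weekday 0 = Monday.

    Assumption: each ride is (pickup, dropoff, miles) and is binned by
    its pickup time.
    """
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum of speeds, ride count]
    for pickup, dropoff, miles in rides:
        hours = (dropoff - pickup).total_seconds() / 3600.0
        if hours <= 0 or miles <= 0:
            continue  # skip records with no usable duration or distance
        cell = totals[(pickup.weekday(), pickup.hour)]
        cell[0] += miles / hours
        cell[1] += 1
    return {key: speed_sum / count for key, (speed_sum, count) in totals.items()}


# Made-up sample rides, for illustration only.
rides = [
    (datetime(2015, 7, 17, 8, 0), datetime(2015, 7, 17, 8, 30), 5.0),    # Friday, 10 mph
    (datetime(2015, 7, 17, 8, 15), datetime(2015, 7, 17, 8, 45), 10.0),  # Friday, 20 mph
    (datetime(2015, 7, 19, 3, 0), datetime(2015, 7, 19, 3, 10), 4.0),    # Sunday, 24 mph
    (datetime(2015, 7, 19, 3, 0), datetime(2015, 7, 19, 3, 0), 1.0),     # zero duration, skipped
]
grid = mean_speed_grid(rides)
```

Each cell here averages per-ride speeds. Dividing total distance by total time within a cell is an equally plausible choice and would weight long rides more heavily.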

Taxi Tip %: Weekday vs. Hour 2D Heat Map

The 2D heat maps give us important insight into tipping behavior, but we need more powerful analysis to decode what that behavior depends on.

Taxi Tip Percentages and Speeds (GREEN on the Left)

Ratio of Taxi Passengers Who Leave No Tip

Tip Percentages for Those Who Tip
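The two quantities in these plots, the share of passengers who leave no tip and the tip percentage among those who do tip, could be computed along the following lines. This is a minimal sketch that assumes the tip percentage is the tip divided by the fare. The source does not state the exact definition, and the function name `tip_summary` and the sample values are hypothetical.

```python
def tip_summary(rides):
    """Summarize tipping for an iterable of (fare, tip) pairs.

    Returns (no_tip_ratio, mean_tip_pct), where mean_tip_pct is averaged
    only over rides that include a tip.
    Assumption: tip % = 100 * tip / fare.
    """
    valid = 0
    no_tip = 0
    tip_pcts = []
    for fare, tip in rides:
        if fare <= 0:
            continue  # a percentage is undefined without a positive fare
        valid += 1
        if tip <= 0:
            no_tip += 1
        else:
            tip_pcts.append(100.0 * tip / fare)
    no_tip_ratio = no_tip / valid if valid else float("nan")
    mean_tip_pct = sum(tip_pcts) / len(tip_pcts) if tip_pcts else float("nan")
    return no_tip_ratio, mean_tip_pct


# Made-up (fare, tip) pairs, for illustration only.
sample = [(10.0, 2.0), (20.0, 0.0), (8.0, 2.0), (15.0, 0.0), (0.0, 1.0)]
ratio, mean_pct = tip_summary(sample)
```

Keeping the two numbers separate matters because averaging the tip percentage over all rides would mix "how many people tip" with "how much tippers give".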

We plot the above behaviors for different time frames for comparison

We plot the durations (minutes) of each time frame

And the distances (miles)